{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MwTWzDxYgbrR"
   },
   "source": [
    "# Athena\n",
    "\n",
    ">[Amazon Athena](https://aws.amazon.com/athena/) is a serverless, interactive analytics service built\n",
    ">on open-source frameworks, supporting open-table and file formats. `Athena` provides a simplified,\n",
    ">flexible way to analyze petabytes of data where it lives. Analyze data or build applications\n",
    ">from an Amazon Simple Storage Service (S3) data lake and 30 data sources, including on-premises data\n",
    ">sources or other cloud systems using SQL or Python. `Athena` is built on open-source `Trino`\n",
    ">and `Presto` engines and `Apache Spark` frameworks, with no provisioning or configuration effort required.\n",
    "\n",
    "This notebook goes over how to load documents from `AWS Athena`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting up\n",
    "\n",
    "Follow [instructions to set up an AWS accoung](https://docs.aws.amazon.com/athena/latest/ug/setting-up.html).\n",
    "\n",
    "Install a python library:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "F0zaLR3xgWmO"
   },
   "outputs": [],
   "source": [
    "! pip install boto3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "076NLjfngoWJ"
   },
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders.athena import AthenaLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "XpMRQwU9gu44"
   },
   "outputs": [],
   "source": [
    "database_name = \"my_database\"\n",
    "s3_output_path = \"s3://my_bucket/query_results/\"\n",
    "query = \"SELECT * FROM my_table\"\n",
    "profile_name = \"my_profile\"\n",
    "\n",
    "loader = AthenaLoader(\n",
    "    query=query,\n",
    "    database=database_name,\n",
    "    s3_output_uri=s3_output_path,\n",
    "    profile_name=profile_name,\n",
    ")\n",
    "\n",
    "documents = loader.load()\n",
    "print(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "5IBapL3ejoEt"
   },
   "source": [
    "Example with metadata columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "wMx6nI1qjryD"
   },
   "outputs": [],
   "source": [
    "database_name = \"my_database\"\n",
    "s3_output_path = \"s3://my_bucket/query_results/\"\n",
    "query = \"SELECT * FROM my_table\"\n",
    "profile_name = \"my_profile\"\n",
    "metadata_columns = [\"_row\", \"_created_at\"]\n",
    "\n",
    "loader = AthenaLoader(\n",
    "    query=query,\n",
    "    database=database_name,\n",
    "    s3_output_uri=s3_output_path,\n",
    "    profile_name=profile_name,\n",
    "    metadata_columns=metadata_columns,\n",
    ")\n",
    "\n",
    "documents = loader.load()\n",
    "print(documents)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
